
Add jupyter-fs integration with projspec chips and scan-url backend #2

Open
ktaletsk wants to merge 9 commits into main from feature/jupyter-fs-support

Conversation

@ktaletsk
Collaborator

@ktaletsk ktaletsk commented Feb 25, 2026

Motivation

Project discovery only worked with the local file browser. Many workflows involve remote filesystems (S3, Samba, SFTP) accessed through jupyter-fs. This PR extends projspec to work with jupyter-fs resources.

What changes

  • Chips appear automatically in each jupyter-fs sidebar, updating as you navigate directories
  • The right-sidebar panel stays in sync with whichever file browser tab is active (local or jfs)
  • New /scan-url backend endpoint scans remote filesystems via fsspec URLs, validated against configured jupyter-fs resources
  • Path traversal protection and credential redaction in logs
  • ScanSource discriminated union (local | jfs) separates the two scan paths in the frontend
  • Degrades silently when jupyter-fs is not installed
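
To illustrate the credential redaction item above, here is a minimal sketch of how a URL could be made safe for logging. It is not the PR's `_redact_url_credentials` implementation, which is not shown in this conversation; the function name `redact_url_credentials` and the `***` mask are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str) -> str:
    """Return `url` with any embedded password replaced by '***'.

    Hypothetical stand-in for the PR's _redact_url_credentials helper.
    """
    parts = urlsplit(url)
    if parts.password is None:
        return url  # no password in the URL, nothing to hide

    # Rebuild the netloc by hand: user, masked password, host, optional port.
    host = parts.hostname or ""
    if ":" in host:  # an IPv6 literal needs its brackets back
        host = f"[{host}]"
    netloc = f"{parts.username or ''}:***@{host}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

Note that `urlsplit` lowercases the scheme and hostname, so the redacted string is meant for log output only, not for reuse as the scan target.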

Test plan

  • Open JupyterLab with a jupyter-fs resource configured
  • Chips appear in the jupyter-fs sidebar at root
  • Navigate into a subdirectory — chips update
  • Navigate back via breadcrumbs — chips update
  • Switch between native file browser and jupyter-fs tab — panel follows
  • Click a chip in the jupyter-fs sidebar — panel opens the spec
  • Without jupyter-fs installed, extension works normally
  • pytest jupyter_projspec/tests/test_routes.py

- Bump package version in package.json.
- Add jupyter-fs integration details to README, including automatic detection and per-resource scanning.
- Implement new API endpoint for scanning fsspec URLs in routes.py.
- Introduce JfsChipsWidget for displaying projspec chips in jupyter-fs sidebars.
- Update ProjspecPanel and related components to support jupyter-fs resources.
- Modify various components to handle optional paths and improve error handling for remote filesystems.
…t scanning

- Introduce a new helper function _scan_url to run projspec.Project in a worker thread.
- Update the post method in ScanUrlRouteHandler to use async/await for improved I/O handling.
- Ensure project data is returned as JSON after scanning the fsspec URL.
- Add validation to ensure 'url' is a string and 'subpath' is either a string or null, returning a 400 error for invalid inputs.
- Implement a mechanism to use a server-configured allowed URL for path construction, discarding any client-supplied query parameters.
- Introduce a new helper function _redact_url_credentials to safely log URLs by redacting embedded passwords.
- Update _normalize_url to ensure consistent URL comparison by stripping query parameters and normalizing paths.
- Add unit tests for URL normalization, allowlist checking, and credential redaction to improve code coverage and reliability.
- Add validation in ScanUrlRouteHandler to return a 422 error when no jupyter-fs resources are configured.
- Improve URL normalization by ensuring percent-encoded characters in netloc and paths are decoded.
- Update _redact_url_credentials to handle various URL formats for better security logging.
- Introduce comprehensive unit tests for resource extraction and URL handling, including edge cases for missing fields and percent-encoded paths.
- Enhance error logging in fetchJfsResources to provide clearer feedback on network and parsing issues.
- Updated ScanUrlRouteHandler to avoid unquoting subpath, preventing potential double-decoding attacks.
- Added a unit test to ensure that double-encoded traversal attempts are correctly blocked, maintaining security against path traversal vulnerabilities.
- Improved handling of subpath validation to reject malformed inputs without crashing the application.
- Implement a two-layer traversal check to block both raw and single-encoded traversal attempts, improving security against path traversal vulnerabilities.
- Update unit tests to verify that single-encoded dot segments are correctly identified and rejected, while allowing legitimate folder names containing '%' characters.
- Refactor subpath normalization to ensure consistent handling of valid inputs.
- Deleted a test case that checked for a 400 response when a subpath normalizes to '.', as it was deemed unnecessary.
- This change streamlines the test suite while maintaining coverage for critical validation scenarios.
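
The two-layer traversal check described in the bullets above could look roughly like the following. This is a sketch under assumptions, not the code in routes.py: the name `is_traversal` is hypothetical, and treating backslashes as separators is an extra precaution added for the example.

```python
from urllib.parse import unquote


def is_traversal(subpath: str) -> bool:
    """True if any path segment is '..', either raw or after one decode.

    Layer 1 inspects the subpath exactly as the client sent it.
    Layer 2 inspects it after a single percent-decode, which catches
    forms such as '%2e%2e'. The decoded string is used only for this
    check; the caller keeps working with the original subpath.
    """
    for candidate in (subpath, unquote(subpath)):
        # Assumption: treat backslashes as separators so '..\\x' is caught.
        segments = candidate.replace("\\", "/").split("/")
        if ".." in segments:
            return True
    return False
```

Because `unquote` leaves a lone `%` untouched, a folder name such as `100%` is not rejected by this check.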
The MutationObserver was dropping its sidebar-level observation when
narrowing to the breadcrumb element. When tree-finder replaced the
breadcrumb during navigation (e.g. clicking root after visiting a
subdirectory), the observer missed the replacement because it was only
watching the now-detached old element.

Keep the sidebar observation permanently active so structural changes
are always detected, and defer the breadcrumb re-read by one task to
let tree-finder finish rendering the replacement element.

Made-with: Cursor
@github-actions

Binder 👈 Launch a Binder on branch fsspec/jupyter-projspec/feature%2Fjupyter-fs-support

Copilot AI left a comment

Pull request overview

Extends the projspec JupyterLab extension to support project discovery on remote/virtual filesystems surfaced via jupyter-fs, keeping chips/panel in sync across local and jupyter-fs file browser tabs and adding a backend endpoint to scan fsspec URLs safely.

Changes:

  • Add a new backend /jupyter-projspec/scan-url endpoint to scan allowed jupyter-fs resources (with traversal protection + credential redaction).
  • Introduce a ScanSource union (local | jfs) and update panel/chips to scan either local paths or jupyter-fs URLs.
  • Inject projspec chips into jupyter-fs sidebars and sync the right panel to the active left sidebar tab.
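
As a rough illustration of the allowlist step, the sketch below compares a client URL against configured resource URLs after normalization and returns the server-side string on a match, so client-supplied query parameters never reach the scan. The names `normalize_url` and `match_allowed` are hypothetical, and the normalization rules (drop the query, percent-decode netloc and path, strip a trailing slash) are assumptions based on the commit messages in this PR.

```python
from typing import Iterable, Optional
from urllib.parse import unquote, urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to scheme://netloc/path, for comparison only."""
    parts = urlsplit(url)  # query and fragment are split off and dropped
    netloc = unquote(parts.netloc)
    path = unquote(parts.path).rstrip("/")
    return f"{parts.scheme}://{netloc}{path}"


def match_allowed(client_url: str, allowed: Iterable[str]) -> Optional[str]:
    """Return the server-configured URL matching `client_url`, else None."""
    wanted = normalize_url(client_url)
    for candidate in allowed:
        if normalize_url(candidate) == wanted:
            # Hand back the configured URL, never the client's string.
            return candidate
    return None
```

A caller would respond with 403 when `match_allowed` returns `None` and otherwise build the scan target from the returned value.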

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| style/base.css | Adds empty-state styling and new CSS hooks for jupyter-fs sidebar chip injection. |
| src/widgets/ProjspecPanel.ts | Switches panel state from a path string to a `ScanSource |
| src/widgets/JfsChipsWidget.ts | New widget that renders chips in jupyter-fs sidebars and tracks breadcrumb navigation via MutationObserver. |
| src/types.ts | Adds ScanSource union + helpers for equality, display, endpoint selection, and request init building. |
| src/tokens.ts | Introduces a JupyterLab token to share panel state/mappings between plugins with explicit activation ordering. |
| src/index.ts | Splits into main + jupyter-fs integration plugin; injects chips into jupyter-fs sidebars and syncs panel to active tab. |
| src/components/SpecItem.tsx | Updates path prop to nullable to disable make-related UI for non-local sources. |
| src/components/ProjspecPanelComponent.tsx | Refactors scanning to use ScanSource and adds a null-source empty state (no scan). |
| src/components/ProjspecChips.tsx | Adds optional scanUrl prop and routes scans through scan-url POST for jupyter-fs. |
| src/components/ProjectView.tsx | Propagates nullable path down to spec items. |
| src/components/ArtifactsView.tsx | Makes “make” actions conditional on path !== null and prevents calls when unavailable. |
| src/api.ts | Adds client helper to fetch jupyter-fs resources from /jupyterfs/resources and compute sidebar IDs. |
| package.json | Bumps extension version to 0.3.0. |
| jupyter_projspec/tests/test_routes.py | Adds extensive unit/integration test coverage for URL normalization, allowlisting, traversal protection, and scan-url validation. |
| jupyter_projspec/routes.py | Adds /scan-url handler with allowlist validation, traversal protection, URL normalization, and credential redaction. |
| README.md | Documents jupyter-fs integration and the new scan-url endpoint. |

}
}
);

Copilot AI Feb 25, 2026

injectChips creates a container and attaches a JfsChipsWidget, but there is no disposal/cleanup path (e.g., removing the injected container and deleting sidebarId from injected/sidebarIdToUrl) if the sidebar widget is later disposed or recreated. Adding a chipsWidget.disposed.connect(...) cleanup, as the local file browser chips already do, would prevent orphaned DOM nodes and allow reinjection after re-creation.

Suggested change
chipsWidget.disposed.connect(() => {
  injected.delete(sidebarId);
  sidebarIdToUrl.delete(sidebarId);
  if (container.parentNode) {
    container.parentNode.removeChild(container);
  }
});

Copilot uses AI. Check for mistakes.
Comment on lines +1186 to +1206
    @patch("jupyter_projspec.routes._get_jfs_resource_urls",
           return_value=["s3://bucket/prefix"])
    async def test_query_param_injection_blocked(self, _mock_jfs, jp_fetch):
        """A URL with injected query params matching an allowed URL must still pass
        the allowlist (query params stripped) and must NOT forward those params."""
        # The handler finds the match and uses the clean server URL, so it won't
        # 403. It will proceed to scan and fail (s3 needs real credentials),
        # but the important assertion is no 403 and no 500 crash.
        with pytest.raises(Exception) as exc_info:
            await jp_fetch(
                "jupyter-projspec", "scan-url",
                method="POST",
                body=json.dumps({
                    "url": "s3://bucket/prefix?evil=creds",
                }).encode(),
            )
        code = exc_info.value.response.code
        assert code != 403, "Should not 403 — URL matches the allowlist"
        # 500 is acceptable here: the handler correctly passed the allowlist
        # and attempted a real S3 scan (no credentials in test env), confirming
        # injected query params were discarded and the clean URL was used.
Copilot AI Feb 25, 2026

test_query_param_injection_blocked currently triggers a real scan of an s3://... URL (only allowlist is patched), which risks slow/flaky tests due to network/credential/provider behavior. Patch _scan_url (or projspec.Project) in this test to fail fast deterministically while still asserting that the allowlist match succeeds and injected query params are discarded.
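
The fail-fast pattern suggested here can be sketched in isolation. The example below does not use the real handler or the `jp_fetch` fixture: `handle_scan` and `ScanFailed` are hypothetical stand-ins, and the scanner is a `Mock` that raises immediately. In the actual test suite the equivalent would presumably be a second `@patch` targeting `jupyter_projspec.routes._scan_url`, assuming that helper is patchable at module level.

```python
from unittest.mock import Mock


class ScanFailed(Exception):
    """Raised by the stub scanner so the test never touches the network."""


def handle_scan(url, allowed, scan):
    """Hypothetical stand-in for the handler: allowlist check, then scan."""
    clean = url.split("?", 1)[0]  # discard client-supplied query params
    if clean not in allowed:
        return 403
    try:
        scan(clean)
    except ScanFailed:
        return 500
    return 200


def test_query_params_discarded():
    # The stub fails deterministically instead of contacting S3.
    scan = Mock(side_effect=ScanFailed("stubbed"))
    status = handle_scan(
        "s3://bucket/prefix?evil=creds", ["s3://bucket/prefix"], scan
    )
    assert status == 500  # passed the allowlist, failed inside the stub
    # The scanner saw the clean URL, without the injected parameter.
    scan.assert_called_once_with("s3://bucket/prefix")
```

Asserting on the stub's call arguments checks directly that the query string was dropped, rather than inferring it from the status code.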

Comment on lines +515 to +519
"""Run projspec.Project() in a worker thread (blocking I/O safe).

Uses the shared _executor. This does not compete with make commands
because make is only available for local paths (the UI disables make
buttons for jfs sources), so there is no thread-pool starvation risk.
Copilot AI Feb 25, 2026

In ScanUrlRouteHandler, _scan_url uses the shared _executor and the docstring claims it “does not compete with make commands”. That’s not true in practice: users can still run local make requests while jupyter-fs scans are in-flight, and remote scans can be long-running (network I/O), potentially consuming threads and delaying/queuing make work. Consider using a dedicated executor for scan-url (or a tighter max_workers / separate concurrency limit) and update the comment accordingly.

Suggested change
- """Run projspec.Project() in a worker thread (blocking I/O safe).
- Uses the shared _executor. This does not compete with make commands
- because make is only available for local paths (the UI disables make
- buttons for jfs sources), so there is no thread-pool starvation risk.
+ """Construct a projspec.Project for the given fsspec URL and return its dict.
+ This function is intended to be run in a worker thread / executor by the
+ caller, since it may perform blocking I/O.
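
The dedicated-executor option mentioned in this comment might look like the sketch below. The pool name `_scan_executor`, the `max_workers=2` limit, and the `blocking_scan` stand-in are all assumptions for illustration; the real handler would call into projspec instead of returning a fake dict.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dedicated pool: remote scans queue here and cannot occupy
# the workers that serve make requests on the shared executor.
_scan_executor = ThreadPoolExecutor(
    max_workers=2, thread_name_prefix="scan-url"
)


def blocking_scan(url: str) -> dict:
    """Stand-in for the blocking projspec scan; returns a fake result."""
    return {"url": url}


async def scan_in_worker(url: str) -> dict:
    """Run the blocking scan on the dedicated pool without blocking the loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_scan_executor, blocking_scan, url)
```

With a separate pool, a burst of slow remote scans can delay other scans but leaves the capacity reserved for local make work untouched.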

@fsspec fsspec deleted a comment from Copilot AI Feb 26, 2026
@fsspec fsspec deleted a comment from Copilot AI Feb 26, 2026
2 participants